Reward Design for an Online Reinforcement Learning Algorithm Supporting Oral Self-Care
Abstract
While dental disease is largely preventable, professional advice on optimal oral hygiene practices is often forgotten or abandoned by patients. Therefore, patients may benefit from timely and personalized encouragement to engage in oral self-care behaviors. In this paper, we develop an online reinforcement learning (RL) algorithm for use in optimizing the delivery of mobile-based prompts to encourage oral hygiene behaviors. One of the main challenges in developing such an algorithm is ensuring that the algorithm considers the impact of current actions on the effectiveness of future actions (i.e., delayed effects), especially when the algorithm has been designed to run stably and autonomously in a constrained, real-world setting characterized by highly noisy, sparse data. We address this challenge by designing a quality reward that maximizes the desired health outcome (i.e., high-quality brushing) while minimizing user burden. We also highlight a procedure for optimizing the hyperparameters of the reward by building a simulation environment test bed and evaluating candidates using the test bed. The RL algorithm discussed in this paper will be deployed in Oralytics. To the best of our knowledge, Oralytics is the first mobile health study utilizing an RL algorithm designed to prevent dental disease by optimizing the delivery of motivational messages supporting oral self-care behaviors.
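The abstract's idea of a reward that trades off brushing quality against user burden can be illustrated with a minimal sketch. This is not the paper's reward definition: the penalty form, the names `RewardParams`, `burden_cost`, `fatigue_threshold`, `shaped_reward`, and `select_params`, and the idea of scoring candidates with a caller-supplied simulator function are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class RewardParams:
    """Hypothetical reward hyperparameters (names are not from the paper)."""
    burden_cost: float        # penalty subtracted when a prompt adds burden
    fatigue_threshold: float  # recent prompt rate above which prompts are penalized


def shaped_reward(brushing_quality: float, sent_prompt: bool,
                  recent_prompt_rate: float, params: RewardParams) -> float:
    """Return brushing quality minus a burden penalty.

    The penalty applies only when a prompt was sent AND the user has
    recently been prompted often (an assumed proxy for delayed effects).
    """
    penalize = sent_prompt and recent_prompt_rate > params.fatigue_threshold
    return brushing_quality - (params.burden_cost if penalize else 0.0)


def select_params(candidates: Iterable[RewardParams],
                  simulate: Callable[[RewardParams], float]) -> RewardParams:
    """Pick the candidate with the highest simulated outcome.

    `simulate` stands in for a simulation test bed: it should run the RL
    algorithm with the given reward hyperparameters and return a score.
    """
    return max(candidates, key=simulate)
```

For example, with `RewardParams(burden_cost=30.0, fatigue_threshold=0.5)`, a prompted session with brushing quality 120.0 and a recent prompt rate of 0.9 yields a reward of 90.0, while the same session without a prompt keeps the full 120.0.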
Similar resources
An Average-Reward Reinforcement Learning
Recently, there has been growing interest in average-reward reinforcement learning (ARL), an undiscounted optimality framework that is applicable to many different control tasks. ARL seeks to compute gain-optimal control policies that maximize the expected payoff per step. However, gain-optimality has some intrinsic limitations as an optimality criterion, since for example, it cannot distinguish ...
Reinforcement Learning for Automatic Online Algorithm Selection - an Empirical Study
In this paper a reinforcement learning methodology for automatic online algorithm selection is introduced and empirically tested. It is applicable to automatic algorithm selection methods that predict the performance of each available algorithm and then pick the best one. The experiments confirm the usefulness of the methodology: using online data results in better performance. As in many onlin...
Loss is its own Reward: Self-Supervision for Reinforcement Learning
Reinforcement learning optimizes policies for expected cumulative reward. Need the supervision be so narrow? Reward is delayed and sparse for many tasks, making it a difficult and impoverished signal for end-to-end optimization. To augment reward, we consider a range of self-supervised tasks that incorporate states, actions, and successors to provide auxiliary losses. These losses offer ubiquito...
An Average-Reward Reinforcement Learning Algorithm for Computing Bias-Optimal Policies
Sridhar Mahadevan, Department of Computer Science and Engineering, University of South Florida, Tampa, Florida 33620. Abstract: Average-reward reinforcement learning (ARL) is an undiscounted optimality framework that is generally applicable to a broad range of control tasks. ARL computes gain-optimal control policies that maximize the expected pa...
Evolved Intrinsic Reward Functions for Reinforcement Learning
The reinforcement learning (RL) paradigm typically assumes a given reward function that is part of the problem being solved by the agent. However, in animals, all reward signals are generated internally, rather than being received directly from the environment. Furthermore, animals have evolved motivational systems that facilitate learning by rewarding activities that often bear a distal relati...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i13.26866